Spejd: A Shallow Processing and Morphological Disambiguation Tool
Abstract
This article presents a formalism and a beta version of a new tool for simultaneous morphosyntactic disambiguation and shallow parsing. Unlike in other shallow parsing formalisms, the grammar rules allow for explicit morphosyntactic disambiguation statements, independent of structure-building statements, which facilitates shallow parsing of morphosyntactically ambiguous or erroneously disambiguated input.
Similar resources
spade Demo: An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation
The paper presents Spejd, an Open Source Shallow Parsing and Disambiguation Engine. Spejd (abbreviated to ♠) is based on a fully uniform formalism both for constituency partial parsing and for morphosyntactic disambiguation — the same grammar rule may contain structure-building operations, as well as morphosyntactic correction and disambiguation operations. The formalism and the engine are more...
SHAKKIL: An Automatic Diacritization System for Modern Standard Arabic Texts
This paper presents SHAKKIL, a system for diacritizing Arabic texts automatically. In this system, the diacritization problem is handled at two levels: morphological and syntactic processing. The adopted morphological disambiguation algorithm depends on four layers: a uni-morphological form layer, a rule-based morphological disambiguation layer, statistical-b...
KURD - A Formalism for Shallow Post Morphological Processing
In most NLP applications an input text undergoes a number of transformations until the desired information can be extracted from it. Typically, such transformations involve part of speech tagging, morphological analysis such as lemmatization or full derivational and compositional analysis, context dependent disambiguation of tagging results, multi-word recognition, shallow, partial or full synt...
ITU Treebank Annotation Tool
In this paper, we present a treebank annotation tool developed for processing Turkish sentences. The tool consists of three annotation stages: morphological analysis, morphological disambiguation and syntax analysis. Each of these stages is integrated with existing analyzers in order to guide human annotators. Our semiautomatic treebank annotation tool is currently used both for crea...
Shallow Parsing of Czech Sentence Based on Correct Morphological Disambiguation
The basis of such an approach is provided by a very complex and sophisticated rule-based morphological disambiguation which can disambiguate a Czech sentence with very high reliability, i.e. with a minimum number of errors. This is, of course, very important for any language and all the more so for Czech, whose ambiguity rate is generally extremely high (as compared e.g. to other Slavic language...